A multigrain Delaunay mesh generation method for multicore SMT-based architectures
Abstract
Given the proliferation of layered, multicore, and SMT-based architectures, it is imperative to deploy and evaluate important multi-level scientific computing codes, such as meshing algorithms, on these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level, medium-grain parallelism at the cavity level, and fine-grain parallelism at the element level. This multigrain data-parallel approach targets clusters built from commercially available SMT and multicore processors. Exploiting coarse-grain parallelism provides scalability in both execution time and problem size on loosely coupled clusters. Exploiting medium-grain parallelism improves performance at the single-node level. Our experimental evaluation shows that the first generation of SMT cores cannot take advantage of fine-grain parallelism in PCDM. Many of our experimental findings with PCDM extend to other adaptive and irregular multigrain parallel algorithms as well.
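The three granularity levels named in the abstract can be pictured with a small sketch. It is not the PCDM implementation: it assumes a Bowyer-Watson-style cavity (the triangles whose circumcircle contains a new point), which the abstract does not spell out, and every function name here (`in_circumcircle`, `cavity`, `refine_subdomain`, `refine_all`) is hypothetical. It only computes cavities. It does not modify the mesh or resolve conflicts between overlapping cavities, which a real parallel mesher must handle.

```python
from concurrent.futures import ThreadPoolExecutor


def in_circumcircle(tri, p):
    """True if p lies strictly inside the circumcircle of tri.

    tri is three (x, y) vertices in counter-clockwise order.
    """
    (ax, ay), (bx, by), (cx, cy) = tri
    px, py = p
    # Translate so that p is the origin, then evaluate the in-circle determinant.
    a0, a1 = ax - px, ay - py
    b0, b1 = bx - px, by - py
    c0, c1 = cx - px, cy - py
    det = ((a0 * a0 + a1 * a1) * (b0 * c1 - c0 * b1)
           - (b0 * b0 + b1 * b1) * (a0 * c1 - c0 * a1)
           + (c0 * c0 + c1 * c1) * (a0 * b1 - b0 * a1))
    return det > 0


def cavity(triangles, p):
    # Element level: each in-circle test is independent of the others,
    # which is where fine-grain parallelism would apply (serial here).
    return [t for t in triangles if in_circumcircle(t, p)]


def refine_subdomain(triangles, points, workers=2):
    # Cavity level (medium grain): cavities of different points are
    # computed concurrently within one subdomain.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: cavity(triangles, p), points))


def refine_all(subdomains, workers=2):
    # Subdomain level (coarse grain): one task per (triangles, points) pair.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda sd: refine_subdomain(sd[0], sd[1]),
                             subdomains))
```

Python threads are used only to show the nesting of the levels. Because of the interpreter lock they would not be expected to speed up this CPU-bound work, and on a cluster the coarse level would map to separate processes or nodes rather than threads.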
Similar resources
Optimizing Irregular Adaptive Applications on Multi-threaded Processors: The Case of Medium-Grain Parallel Delaunay Mesh Generation
The importance of parallel mesh generation and the growth of SMT architectures raise the question of how to adapt parallel mesh generation software to SMT architectures. In this work we focus on Parallel Constrained Delaunay Mesh Generation. We explore medium-grain parallelism at the sub-domain level. This parallel approach targets commercially available SMT processors. Our goal is...
Algorithm, software, and hardware optimizations for Delaunay mesh generation on simultaneous multithreaded architectures
This article focuses on the optimization of PCDM, a parallel, two-dimensional (2D) Delaunay mesh generation application, and its interaction with parallel architectures based on simultaneous multithreading (SMT) processors. We first present the step-by-step effect of a series of optimizations on performance. These optimizations improve the performance of PCDM by up to a factor of six. They targ...
Experience with Memory Allocators for Parallel Mesh Generation on Multicore Architectures
Scalable and locality-aware multiprocessor memory allocators are critical for harnessing the potential of emerging multithreaded and multicore architectures. This paper evaluates two state-of-the-art generic multithreaded allocators designed for both scalability and locality, against custom allocators, written to optimize the multithreaded implementation of parallel mesh generation algorithms. ...
2D Parallel Constrained Delaunay Mesh Generation: A Multigrain Approach on Deep Multiprocessors
Parallel Constrained Delaunay Mesh (PCDM) is a 2D adaptive and irregular meshing algorithm. In PCDM one can explore concurrency using three different levels of granularity: (i) coarse-grain at the sub-mesh level, (ii) medium-grain at the cavity level and (iii) fine-grain at the element level. The medium- and fine-grain approaches can be used to improve the single-processor performance of coarse-g...
Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems
Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure that mitigates the deficits of traditional NoC architecture for future multicore systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communication for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...
Journal title:
- J. Parallel Distrib. Comput.
Volume 69, Issue
Pages -
Publication date: 2009